Abstract

Performing random walks in networks is a fundamental primitive that has found numerous applica- 
tions in communication networks such as token management, load balancing, network topology discovery 
and construction, search, and peer-to-peer membership management. While several such algorithms are 
ubiquitous, and use numerous random walk samples, the walks themselves have always been performed 
naively. 

In this paper, we focus on the problem of performing random walk sampling efficiently in a distributed 
network. Given bandwidth constraints, the goal is to minimize the number of rounds and messages 
required to obtain several random walk samples in a continuous online fashion. We present the first
round- and message-optimal distributed algorithms, which significantly improve on all previous
approaches. The theoretical analysis and comprehensive experimental evaluation of our algorithms show
that they perform very well in networks of differing topologies.

In particular, our results show how several random walks can be performed continuously (when source
nodes are provided only at runtime, i.e., online), such that each walk of length $\ell$ can be performed exactly
in just $\tilde{O}(\sqrt{\ell D})$ rounds (where D is the diameter of the network and $\tilde{O}$ hides polylog(n) factors)
and $O(\ell)$ messages. This significantly improves upon both the naive technique, which requires $O(\ell)$ rounds
and $O(\ell)$ messages, and the sophisticated algorithm of [14], which has the same round complexity as this
paper but requires $\Omega(m\sqrt{\ell})$ messages (where m is the number of edges in the network). Our theoretical
results are corroborated through extensive experiments on various topological data sets. Our algorithms are
fully decentralized, lightweight, and easily implementable, and can serve as building blocks in the design
of topologically-aware networks.

1 Introduction

Random walks play a central role in computer science, spanning a wide range of areas in both theory and 
practice, including distributed computing and communication networks. Algorithms in many different
applications use random walks as an integral subroutine. Applications in communication networks include token
management, load balancing [22], small-world routing [24], search [36, 10, 19, 27], information
propagation and gathering [23], network topology construction [19, 25, 26], checking expanders [16],
constructing random spanning trees [8, 5], monitoring overlays [29], group communication in ad-hoc
networks, gathering and dissemination of information over a network, distributed construction of
expander networks [25], and peer-to-peer membership management [17]. Random walks have also been
used to provide uniform and efficient solutions to distributed control of dynamic networks [9]. [36] describes
a broad range of network applications that can benefit from random walks in dynamic and decentralized 
settings. For further references on applications of random walks to distributed computing and networks, see, 

e.g. una. 

A key purpose of random walks in network applications is to perform node sampling. Random walk-based
sampling is simple, local, and robust. Random walks also require little index or state maintenance,
which makes them especially attractive to self-organizing dynamic networks such as Internet overlay and ad
hoc wireless networks [9, 36]. In this paper we present efficient distributed random walk sampling algorithms
that are significantly faster than the existing and naive approaches and at the same time achieve
optimal message complexity. Our experimental results further show that our techniques perform very well in
various network topologies.

While the sampling requirements in different applications vary, whenever a true sample is required from
a random walk of a certain number of steps, all applications perform the walks naively, by simply passing a
token from one node to its neighbor: thus performing a random walk of length $\ell$ takes time and messages that
are linear in $\ell$. Such an algorithm may not scale well as the network size increases, and hence it is
better to investigate algorithms with sublinear time and message complexity. Previous work in [14] shows
how to (partially) overcome this hurdle through a quadratic improvement in time and perform random walks
optimally, i.e., in $\tilde{O}(\sqrt{\ell D})$ rounds. However, their algorithm requires a large number of messages for every
random walk, depending on the number of edges in the network. The algorithm presented here shows how
to perform the walks with optimal message complexity, i.e., just $O(\ell)$ messages per walk amortized, without
compromising at all on the worst-case round complexity. Such algorithms can be useful building blocks in
the design of topologically (self-)aware networks, i.e., networks that can monitor and regulate themselves in
a decentralized fashion. (For example, efficiently computing the mixing time or the spectral gap allows the
network to monitor its connectivity and expansion properties [14].) Further, the previous papers
[13, 14] only considered performing a single walk, or a few walks. Most applications, however, require
several walks to be performed in a continuous manner. This continuous processing of walks is of specific
importance in distributed networks, and our results are applicable in this general framework.

Our Contributions 

1. We introduce the problem of continuous processing of random walks. The objective is for a network to 
support a continuous sequence of random walk requests from various source nodes and perform node sam- 
pling to minimize round and message complexity for each request. 

2. We present the first algorithm that is efficient in both round complexity and message complexity.
Our technique and analysis present almost-tight bounds on the message and round complexity in a widely
used network congestion model.

3. We perform a comprehensive experimental evaluation on numerous topological networks and highlight the
effectiveness and efficiency of our algorithm. The experimental results corroborate the theoretical
contributions and show that our random walk sampling algorithm performs very well on various metrics for all
parameter ranges.

Overview. In the remainder of this section, we discuss the network model, related work, and the formal
notation and problem formulation considered in this paper. We present our algorithms, and message and
round complexity analyses, in Section 2. This rests on some concentration analysis of key random walk
properties of our algorithm that we then prove in Section 3. Finally, we present extensive experiments on
various topological networks in Section 4.

1.1 Distributed Network Model 

We model the communication network as an undirected, unweighted, connected n-node graph G = (V, E). 
Every node has limited initial knowledge. Specifically, assume that each node is associated with a distinct 
identity number (e.g., its IP address). At the beginning of the computation, each node v accepts as input 
its own identity number and the identity numbers of its neighbors in G. The node may also accept some 
additional inputs as specified by the problem at hand. The nodes are allowed to communicate through the 
edges of the graph G. We assume that the communication occurs in synchronous rounds. We will use
only small-sized messages. In particular, in each round, each node v is allowed to send a message of size
$O(\log n)$ through each edge e = (v, u) that is incident to v. The message will arrive at u at the end of the
current round. This is a widely used standard model to study distributed algorithms (e.g., see [34, 33]) and
captures the bandwidth constraints inherent in real-world computer networks. Our algorithms can be easily
generalized if B bits are allowed (for any pre-specified parameter B) to be sent through each edge in a round.
Typically, as assumed here, $B = O(\log n)$, which is the number of bits needed to send a node ID in an n-node
network.

While this is a nice theoretical abstraction, it does not capture the most natural practical difficulties.
A well-established concern with this model is that, for simple operations, the entire network may spawn a
large number of parallel messages in order to minimize rounds. This can be very expensive from a practical
standpoint. A critical component in the analysis of practical algorithms is the overall message complexity per
execution of any algorithm. This becomes even more crucial from the standpoint of continuous processing
of algorithms, perhaps even in parallel. Therefore, the goal is to design algorithms that simultaneously have
a low amortized message complexity and a low worst-case round complexity. Because these two goals
conflict, few algorithms perform well on both metrics. In this paper we present an algorithm that is
near-optimal in terms of both messages and rounds.

1.2 Related Work and Problem Statement 
Applications and Related Work 

Random walks have been used in a wide variety of applications in distributed networks as mentioned previ- 
ously. We describe here some of the applications in more detail. 

Morales and Gupta [29] discuss discovering a consistent and available monitoring overlay for a
distributed system. For each node, one needs to select and discover a list of nodes that would monitor it.
The monitoring set of nodes needs to satisfy some structural properties such as consistency, verifiability, load
balancing, and randomness, among others. This is where random walks come in. Random walks are a
natural way to discover a set of random nodes that are spread out (and hence scalable), and that can in turn be
used to monitor their local neighborhoods. Random walks have been used for this purpose in another paper
by Ganesh et al. [17] on peer-to-peer membership management for gossip-based protocols. Morales and
Gupta [30, 31] have several more papers in their line of work on the AVMON system and similar systems that
use several continuous node samples as a way to monitor distributed systems.

Speeding up distributed algorithms using random walks has been considered for a long time. Besides our
approach of speeding up the random walk itself, one popular approach is to reduce the cover time. Recently,
Alon et al. showed that performing several random walks in parallel reduces the cover time in various
types of graphs. They assert that the problem with performing random walks is often the latency. In these
scenarios where many walks are performed, our results could help avoid too much latency and yield an
additional speed-up factor.

A nice application of random walks is in the design and analysis of expanders. We mention two results
here. Law and Siu [25] consider the problem of constructing expander graphs in a distributed fashion. One
of the key subroutines in their algorithm is to perform several random walks from specified source nodes.
Dolev and Tzachar [16] use random walks to check if a given graph is an expander. The first algorithm
given in [16] is essentially to run a random walk of length $n \log n$ and mark every visited vertex. Later, it
is checked whether all vertices have been visited. Broder [8] and Wilson [35] gave algorithms to generate
random spanning trees using random walks, and Broder's algorithm was later applied to the network setting by
Bar-Ilan and Zernik [5]. Recently, Goyal et al. [20] showed how to construct an expander/sparsifier using random
spanning trees. A variety of such applications can greatly benefit from the random walk sampling techniques
presented in this paper.

Notation and Problem Statement 

The basic problem we address is the following. We are given an arbitrary undirected, unweighted, and
connected n-node (vertex) network G = (V, E) and a random walk request length $\ell$. The goal is to devise a
distributed algorithm that continuously accepts source node inputs s and, after each input,
performs a random walk of length $\ell$ and outputs the ID of a node v which is randomly picked
according to the probability that it is the destination of a random walk of length $\ell$ starting at s. Throughout
this paper, we assume the standard random walk: in each step, an edge is taken from the current node x with
probability $1/\deg(x)$, where $\deg(x)$ is the degree of x. Our goal is to output a true random
sample from the $\ell$-walk distribution starting from s. Once the sample has been output, a new request is issued
to the algorithm, and this proceeds in a continual manner. The objective is to minimize the round complexity
as well as the message complexity for each of these requests.
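
To make the target distribution concrete, the following minimal Python sketch computes the exact endpoint distribution of a fixed-length walk on a small graph by propagating probabilities one step at a time. It is a centralized illustration only, not the distributed algorithm of this paper; the function name `walk_distribution` and the adjacency-list dictionary `adj` are assumptions made for this example.

```python
from fractions import Fraction


def walk_distribution(adj, source, length):
    """Exact endpoint distribution of a `length`-step standard random walk.

    `adj` maps each node to the list of its neighbors (hypothetical
    representation chosen for this sketch).
    """
    dist = {source: Fraction(1)}
    for _ in range(length):
        nxt = {}
        for x, p in dist.items():
            share = p / len(adj[x])  # each incident edge is taken with prob. 1/deg(x)
            for y in adj[x]:
                nxt[y] = nxt.get(y, Fraction(0)) + share
        dist = nxt
    return dist
```

A correct sampler for a request (s, length) must output node v with exactly the probability `walk_distribution(adj, s, length)[v]`.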

For clarity, observe that the following naive algorithm solves the above problem for a single request in
$O(\ell)$ rounds and $O(\ell)$ messages: the walk of length $\ell$ is performed by sending a token for $\ell$ steps, picking a
random neighbor with each step. Then, the destination node v of this walk sends its ID back (along the same
path) to the source for output. Our goal is to perform such sampling with significantly fewer rounds.
At the other extreme is the series of works [13, 14, 32] where the round complexity of this single random walk
request was heavily optimized by a cleverer algorithm. We mention details about this below, but in essence
they proposed an approach to perform this walk in $\tilde{O}(\sqrt{\ell D})$ rounds, but in exchange incurred a message
complexity of $\Omega(m\sqrt{\ell})$. We would like to improve the message complexity significantly from here. In
particular, we would like the best of both extremes, and also to support continuous requests in the process.
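
The naive token-passing baseline described above can be sketched as follows. This is a centralized Python simulation under the assumption that each token hop costs one round and one message, both on the way out and on the way back along the same path; `naive_random_walk` and `adj` are hypothetical names introduced for this example.

```python
import random


def naive_random_walk(adj, source, length, rng=None):
    """Pass a token for `length` steps, then return the destination ID.

    Returns (destination, rounds, messages) under the assumption that
    every hop costs exactly one round and one message.
    """
    rng = rng or random.Random()
    current = source
    for _ in range(length):
        current = rng.choice(adj[current])  # uniformly random neighbor
    forward = length   # token hops from source to destination
    backward = length  # destination ID travels back along the reverse path
    return current, forward + backward, forward + backward
```

Both costs grow linearly with the walk length, which is the behavior the algorithms in this paper aim to improve upon.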

The problem of performing just one walk was proposed in [13] under the name Computing One Random
Walk where Source Outputs Destination (1-RW-SoD) (for short, this problem will simply be called Single
Random Walk in this paper), wherein the first sublinear time distributed algorithm was provided, requiring
$\tilde{O}(\ell^{2/3} D^{1/3})$ rounds ($\tilde{O}$ hides polylog(n) factors); this improves over the naive $O(\ell)$ algorithm when the
walk is long compared to the diameter (i.e., $\ell = \Omega(D\,\mathrm{polylog}\, n)$, where D is the diameter of the network).
This was the first result to break past the inherent sequential nature of random walks and beat the naive
$\ell$-round approach, despite the fact that random walks have long been used in distributed networks and in
a wide variety of applications. It was further conjectured in [13] that the true number of rounds for this
problem is $\tilde{O}(\sqrt{\ell D})$.
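
To see why these bounds only help for walks longer than the diameter, the comparison can be worked out directly. The derivation below ignores the polylog(n) factors, which is a simplification made here for illustration; $\ell$ denotes the walk length and D the diameter.

```latex
\ell^{2/3} D^{1/3} \le \ell
  \iff D^{1/3} \le \ell^{1/3}
  \iff D \le \ell,
\qquad
\sqrt{\ell D} \le \ell^{2/3} D^{1/3}
  \iff D^{1/6} \le \ell^{1/6}
  \iff D \le \ell .
```

For instance, with $\ell = 10^6$ and $D = 100$, the three quantities $\ell$, $\ell^{2/3} D^{1/3}$, and $\sqrt{\ell D}$ are roughly $10^6$, $4.6 \times 10^4$, and $10^4$, respectively.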

The high-level idea used in the $\tilde{O}(\ell^{2/3} D^{1/3})$-round algorithm in [13] is to "prepare" a few short walks
in the beginning (executed in parallel) and then carefully stitch these walks together later as necessary. The
same general approach was introduced in [12] to find random walks in data streams, with the main motivation
of finding PageRank. However, the two models have very different constraints and motivations, and hence
the subsequent techniques used in [13] and [12] are very different. The algorithms in [14] use the same
general approach as [13] but exploit certain key properties of random walks to design even faster sublinear
time algorithms; in particular, they show how a random walk can be performed in $\tilde{O}(\sqrt{\ell D})$ rounds. It was
then shown in [32] that these techniques are optimal in round complexity for performing a single random
walk.

None of these papers considered message complexity, however, nor did they consider the problem of
continuously processing random walk requests. Our current paper is the first to ask about amortized message
complexity under this continuous framework, and our algorithms retain worst-case optimality in
round complexity as well.

2 Theoretical Analysis of Algorithms 

2.1 Algorithm descriptions 

We first describe the algorithm for a single random walk in [14] and then describe how to extend this idea
to continuous random walks. The current algorithm is also randomized, and we focus more on the message
complexity. The high-level idea for a single random walk is to perform many short random walks in parallel
and later stitch them together [13, 14]. Then for multiple random walks we choose the source node randomly
each time and perform a single random walk using the same set of short walks.

Our main algorithm for performing continuous random walks, each of length $\ell$, is described in
CONTINUOUS-RANDOM-WALK (cf. Algorithm 3). This algorithm uses two other algorithms, PRE-PROCESSING
(cf. Algorithm 1) and SINGLE-RANDOM-WALK (cf. Algorithm 2). The PRE-PROCESSING function is called only
once at the beginning of CONTINUOUS-RANDOM-WALK, to perform $\eta\, d(v) \log n$ short walks of length
$\lambda$ from each vertex v; only once these pre-processed short walks are insufficient to answer a single random
walk request is the pre-processing table reconstructed, after which the algorithm resumes answering single
random walk requests using the short walks from the new table. At the end of PRE-PROCESSING,
each vertex knows the destination IDs of the short walks that it initiated.

2.2 Previous Results - Rounds and Messages 

We first restate the main round complexity theorem for SINGLE-RANDOM-WALK and also state the message
complexity of this algorithm.

Lemma 2.1 (Theorem 2.5 in [14]). For any $\ell$, Algorithm SINGLE-RANDOM-WALK (cf. Algorithm 2)
solves the Single Random Walk Problem and, with probability at least $1 - \frac{2}{n}$, finishes in
$O\left(\lambda \eta \log n + \frac{\ell D}{\lambda}\right)$ rounds.



Algorithm 1 PRE-PROCESSING($\eta$, $\lambda$)

Input: number of short walks of each node v is $\eta \deg(v) \log n$, and desired short walk length $\lambda$.
Output: set of short random walks of each node.

Each node v performs $\eta_v = \eta \deg(v) \log n$ random walks of length $\lambda + r_i$, where $r_i$ (for each $1 \le i \le \eta_v$)
is chosen independently at random in the range $[0, \lambda - 1]$.

1: Let $r_{\max} = \max_{1 \le i \le \eta_x} r_i$, the maximum of the random numbers chosen independently for each of
   the $\eta_x$ walks.
2: Each node x constructs $\eta_x$ messages containing its ID; in addition, the i-th message contains the
   desired walk length of $\lambda + r_i$.
3: for i = 1 to $\lambda + r_{\max}$ do
4:    Each node v does the following: Consider each message M held by v and received in the (i - 1)-th
      iteration (having current counter i - 1). If the message M's desired walk length is at most i, then
      v stores the ID of the source (v is the desired destination). Else, v picks a neighbor u uniformly at
      random and forwards M to u after incrementing its counter.
5: end for
6: Send the destination IDs back to the respective sources (this can be done by sending the destination IDs
   along the "reverse" path).

Algorithm 2 SINGLE-RANDOM-WALK(s, $\ell$)

Input: Starting node s, and desired walk length $\ell$.
Output: Destination node of the walk outputs the ID of s.

Stitch $\Theta(\ell/\lambda)$ walks, each of length in $[\lambda, 2\lambda - 1]$

1: The source node s creates a message called "token" which contains the ID of s
2: The algorithm generates a set of connectors, denoted by C, as follows. (Connectors are the endpoints of
   the short walks, i.e., the points where we stitch.)
3: Initialize C = {s}
4: while Length of walk completed is at most $\ell - 2\lambda$ do
5:    Let v be the node that is currently holding the token.
6:    v chooses uniformly at random one of its unused short walks, if any exists, and lets v' be its
      destination (the destination of an unused random walk of length between $\lambda$ and $2\lambda - 1$).
7:    if v' = NULL (all walks from v have already been used up) then
8:       Algorithm terminates failing this walk.
9:    end if
10:   v sends the token to v'
11:   C = C ∪ {v}
12: end while
13: Walk naively until $\ell$ steps are completed (this is at most another $2\lambda$ steps)
14: A node holding the token outputs the ID of s
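
The two procedures above can be sketched as a centralized Python simulation. This is an illustration under stated assumptions, not the distributed implementation: the graph is a dictionary `adj` of neighbor lists for a connected graph with at least two nodes, the logarithm is taken to be natural, each table entry stores the walk length alongside the destination so that progress can be tracked, and stitching is a plain dictionary lookup instead of routing a token. The names `pre_process` and `single_random_walk` are hypothetical.

```python
import math
import random


def pre_process(adj, eta, lam, rng):
    """Return table[v] = list of (destination, length) for v's short walks."""
    n = len(adj)
    table = {}
    for v in adj:
        # eta * deg(v) * log n short walks per node (natural log assumed here)
        count = max(1, math.ceil(eta * len(adj[v]) * math.log(n)))
        walks = []
        for _ in range(count):
            length = lam + rng.randrange(lam)  # uniform in [lam, 2*lam - 1]
            cur = v
            for _ in range(length):
                cur = rng.choice(adj[cur])
            walks.append((cur, length))
        table[v] = walks
    return table


def single_random_walk(adj, table, source, ell, lam, rng):
    """Stitch unused short walks; return the destination, or None on failure."""
    cur, done = source, 0
    while done <= ell - 2 * lam:
        if not table[cur]:
            return None  # all short walks from cur are used up
        idx = rng.randrange(len(table[cur]))
        dest, length = table[cur].pop(idx)  # each short walk is used at most once
        cur, done = dest, done + length
    for _ in range(ell - done):  # finish the remaining (< 2*lam) steps naively
        cur = rng.choice(adj[cur])
    return cur
```

Removing a short walk from the table once it has been stitched keeps successive samples independent, which is why the table eventually runs out and has to be rebuilt.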



Algorithm 3 CONTINUOUS-RANDOM-WALK($\ell$)

Input: $\ell$.
Output: Continuous $\ell$-length random walk samples from source nodes presented adversarially or randomly.
Source nodes S: The source node of each walk of length $\ell$ can be presented adversarially or randomly
according to some distribution. Let this continuous sequence of source nodes be denoted by the ordered set
S.

1: Call PRE-PROCESSING($\eta = 1$, $\lambda = 24\sqrt{\ell D}(\log n)^3$)
2: while Indefinitely do
3:    while Algorithm does not fail (algorithm gets stuck due to insufficient short walks) do
4:       Select the next source node s from the ordered set S.
5:       Call SINGLE-RANDOM-WALK(s, $\ell$).
6:       If SINGLE-RANDOM-WALK returns with fail, exit loop
7:    end while
8:    Call PRE-PROCESSING($\eta = 1$, $\lambda = 24\sqrt{\ell D}(\log n)^3$) again and use this table.
9:    Rerun request for s and then continue subsequent walks based on random samples.
10: end while
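
The control flow of the continuous algorithm (serve requests from one table, rebuild the table on the first failure, then rerun the failed request) can be sketched in Python as follows. The pre-processing and single-walk procedures are passed in as callables so that the sketch stays self-contained; `build_table`, `walk_once`, and the convention that a failed walk returns `None` are assumptions of this example.

```python
def continuous_random_walk(sources, build_table, walk_once):
    """Yield (source, sample, rebuilds_so_far) for each requested source.

    build_table() -> a fresh pre-processing table
    walk_once(table, s) -> a sample, or None if the table is exhausted
    """
    table = build_table()            # initial pre-processing call
    rebuilds = 0
    for s in sources:
        sample = walk_once(table, s)
        if sample is None:           # request failed: short walks ran out
            table = build_table()    # rerun pre-processing
            rebuilds += 1
            sample = walk_once(table, s)  # rerun the failed request
            if sample is None:
                raise RuntimeError("fresh table could not serve the request")
        yield s, sample, rebuilds
```

Because the function is a generator, it can consume an unbounded stream of sources, matching the indefinite outer loop of the pseudocode.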



Lemma 2.2. The message complexity of SINGLE-RANDOM-WALK is $O\left(\eta \lambda m \log n + \frac{\ell D}{\lambda}\right)$, where m is the
number of edges and D is the diameter of the network.

Proof. To compute its $\eta \deg(v) \log n$ short walks of length $\lambda$, a node v uses $\Theta(\lambda \eta \deg(v) \log n)$ messages,
since a single short walk of length $\lambda$ sends $\lambda$ messages. Hence over all n nodes this requires
$\Theta\left(\lambda \eta \log n \sum_v \deg(v)\right) = \Theta(\lambda \eta m \log n)$ messages. For stitching one short walk with another we need
to contact the destination ID. This can be done quickly by using a BFS tree. Note that the BFS tree needs to
be constructed only once ($\Theta(m)$ messages) and each stitch uses $O(D)$ messages. (If we assume that nodes have
access to a shortest path routing table, then the BFS tree is not needed.) Combining these, the lemma
follows. □

In networks such as P2P or overlay networks, if we assume that a node can quickly (in constant
time) access another node whose ID (IP address) is known, then one can improve the time and message
complexity of stitching, saving a $\Theta(D)$ factor.
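
The two contributions to the message count of a single walk can be written out explicitly. Here $\lambda$ is the short-walk length, $\eta$ the pre-processing multiplier, and $\ell$ the requested walk length; the sketch uses the handshake identity $\sum_v \deg(v) = 2m$ and the fact that a walk of length $\ell$ stitches $\Theta(\ell/\lambda)$ short walks, and it ignores the factor of at most 2 coming from short walks having length between $\lambda$ and $2\lambda - 1$.

```latex
\underbrace{\sum_{v \in V} \lambda \cdot \eta \deg(v) \log n}_{\text{short walks}}
  = \lambda \eta \log n \sum_{v \in V} \deg(v)
  = 2 m \lambda \eta \log n ,
\qquad
\underbrace{\Theta\!\left(\frac{\ell}{\lambda}\right) \cdot O(D)}_{\text{stitching}}
  = O\!\left(\frac{\ell D}{\lambda}\right).
```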

We now analyze the round and message complexity of the CONTINUOUS-RANDOM-WALK algorithm in
the next two subsections. To simplify the analysis, we use $\kappa$ to denote the fraction of short walks of the
pre-processing table that get used before the algorithm fails and needs to rerun the pre-processing stage. The
next two subsections assume a value of $\kappa$ and prove bounds using it. In the following section, we present
bounds on $\kappa$ itself to arrive at the main theorem of this paper. To recall other notation, $\eta_v = \eta \deg(v) \log n$ is
the number of short walks pre-processed for each node v, $\lambda$ is the length of these short walks, n is the
number of nodes, m is the number of edges, and D is the diameter of the network.

2.3 Round Complexity 

Lemma 2.3. For any $\ell$, Algorithm CONTINUOUS-RANDOM-WALK (cf. Algorithm 3) serves continuous
random walk requests such that, with probability at least $1 - \frac{2}{n}$, the total number of rounds used until
PRE-PROCESSING needs to be invoked for a second time is $O(\lambda \eta \log n + \kappa m \eta D \log n)$, where $\kappa$ is the
fraction of used short walks from the pre-processing table.

Proof. The proof is the same as that of Theorem 2.5 in [14] for Single Random Walk; the only difference is
that we are performing continuous walks of the same length $\ell$. For continuous walks, if $\kappa$ is the fraction of
used short walks from the pre-processing table, then a total of $O(\kappa m \eta \log n)$ short walks are used. Hence we
need to stitch $O(\kappa m \eta \log n)$ times, which by Lemma 2.3 in [14] contributes $O(\kappa m \eta D \log n)$ rounds. Hence
the total is $O(\lambda \eta \log n + \kappa m \eta D \log n)$ rounds. □

Corollary 2.4. The average number of rounds per random walk of length $\ell$ of CONTINUOUS-RANDOM-WALK
(cf. Algorithm 3) is $O\left(\frac{\ell}{\kappa m} + \frac{\ell D}{\lambda}\right)$ with high probability.

Proof. The total number of random walks of length $\ell$ that have been completed successfully by
CONTINUOUS-RANDOM-WALK is $\Theta\left(\frac{\kappa m \eta \lambda \log n}{\ell}\right)$, as a total of $O(\kappa m \eta \log n)$ short walks, each of length
$\lambda$, have been used. Hence the bound on the average number of rounds per walk follows. □
2.4 Message Complexity 

Lemma 2.5. The message complexity of CONTINUOUS-RANDOM-WALK, until PRE-PROCESSING needs to
be invoked for a second time, is $O(\eta \lambda m \log n + \kappa m \eta D \log n)$, where $\kappa$ is the fraction of used short walks
from the pre-processing table.

Proof. The message complexity of the PRE-PROCESSING stage is as before. Further, for each subsequent
$\ell$-length walk request, an additional $O(D\ell/\lambda)$ messages are used. Also, as before, we know that the total
number of random walks of length $\ell$ that have been completed successfully by CONTINUOUS-RANDOM-WALK
is $\Theta\left(\frac{\kappa m \eta \lambda \log n}{\ell}\right)$, as a total of $O(\kappa m \eta \log n)$ short walks, each of length $\lambda$, have been used. Therefore
the contribution from this towards the total message complexity is
$O\left(\frac{D\ell}{\lambda} \cdot \frac{\kappa m \eta \lambda \log n}{\ell}\right)$, which reduces to $O(m D \eta \kappa \log n)$. Combining these, the lemma follows. □

Corollary 2.6. The average number of messages per random walk of length $\ell$ of CONTINUOUS-RANDOM-WALK
is $O\left(\frac{\ell}{\kappa}\left(1 + \frac{\kappa D}{\lambda}\right)\right)$.

Proof. From Lemma 2.5 we know that the total number of messages used for computing all walks
of CONTINUOUS-RANDOM-WALK is $O(\eta \lambda m \log n + \kappa m \eta D \log n)$. Now the total number of walks of
length $\ell$ is $\Theta\left(\frac{\kappa m \eta \lambda \log n}{\ell}\right)$, as a total of $O(\kappa m \eta \log n)$ short walks, each of length $\lambda$, are used. Hence
we get the average number of messages per walk by dividing by this. □

Combining the above two corollaries, we get the following.

Lemma 2.7. The average numbers of rounds and messages per random walk of length $\ell$ of
CONTINUOUS-RANDOM-WALK (cf. Algorithm 3) are $O\left(\frac{\ell}{\kappa m} + \frac{\ell D}{\lambda}\right)$ and
$O\left(\frac{\ell}{\kappa}\left(1 + \frac{\kappa D}{\lambda}\right)\right)$, respectively.

Corollary 2.8. For our choice of $\lambda = \Theta(\sqrt{\ell D})$, the average numbers of rounds and messages per random
walk become $O\left(\frac{\ell}{\kappa m} + \sqrt{\ell D} + D\right)$ and $O\left(\frac{\ell}{\kappa} + D\right)$, respectively.

Proof. If we put $\lambda = \Theta(\sqrt{\ell D})$ in Lemma 2.7, then the average numbers of rounds and messages become
$O\left(\frac{\ell}{\kappa m} + \sqrt{\ell D}\right)$ and $O\left(\frac{\ell}{\kappa} + \sqrt{\ell D}\right)$, respectively. Now $\sqrt{\ell D} \le \ell + D$. So the corollary
follows. □

Note that, in the above corollary, $\kappa \le 1$ can be small, so that the bounds can become large. We show in
the next section that $\kappa$ is a constant and hence our bounds are almost optimal.
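
The averaging step behind these per-walk bounds can be written out as follows, where $\lambda$ is the short-walk length, $\eta$ the pre-processing multiplier, $\kappa$ the used fraction of the table, and $\ell$ the walk length. This is a worked restatement of the division described in the proofs, with the totals divided by the $\Theta(\kappa m \eta \lambda \log n / \ell)$ completed walks.

```latex
\frac{O(\lambda \eta \log n + \kappa m \eta D \log n)}
     {\Theta\!\left(\kappa m \eta \lambda \log n / \ell\right)}
  = O\!\left(\frac{\ell}{\kappa m} + \frac{\ell D}{\lambda}\right),
\qquad
\frac{O(\eta \lambda m \log n + \kappa m \eta D \log n)}
     {\Theta\!\left(\kappa m \eta \lambda \log n / \ell\right)}
  = O\!\left(\frac{\ell}{\kappa} + \frac{\ell D}{\lambda}\right)
  = O\!\left(\frac{\ell}{\kappa}\left(1 + \frac{\kappa D}{\lambda}\right)\right).
```

With $\lambda = \Theta(\sqrt{\ell D})$ the shared term becomes $\ell D / \lambda = \Theta(\sqrt{\ell D})$, and the arithmetic-geometric mean inequality gives $\sqrt{\ell D} \le (\ell + D)/2$, which is the step that turns the message bound into one of the form $\ell/\kappa + D$.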

3 Concentration Bounds on $\kappa$

The goal of this section is to present a lower bound on $\kappa$, the fraction of rows of the pre-processing table
(or the fraction of all short walks) that get used before the algorithm fails to perform a random walk
request and needs to rerun the pre-processing stage. All the analysis in this section assumes that the sources S
in CONTINUOUS-RANDOM-WALK are sampled according to the degree distribution. While the algorithm
CONTINUOUS-RANDOM-WALK remains meaningful otherwise, our proofs crucially rely on this random
sampling of sources. Obtaining similar theorems for more general sequences of sources in S remains an
open question.

We now present the central theorem that lower bounds $\kappa$.

Theorem 3.1. Given any graph G, if CONTINUOUS-RANDOM-WALK is invoked with $\ell = O(m)$ and the
source nodes S are chosen randomly proportional to the node degrees, then the algorithm uses up at least
a $\kappa = \Omega(1)$ fraction of all short walks in the PRE-PROCESSING table before a request fails and a second call
needs to be made to PRE-PROCESSING.

Proof. Assume for now that we do d(v) short walks for each vertex v. The total number of walks of length
$\ell$ is $T = \frac{2m\lambda}{\ell}$ if all the short walks are used. Let $K = \alpha T$, where $\alpha$ is a constant in [0, 1]. Note that if we
manage to perform K walks of length $\ell$, then we have utilized a constant fraction of the short walks. For one
$\ell$-length walk, in expectation a vertex v can be a connector at most $\frac{\ell}{\lambda} \cdot \frac{d(v)}{2m}$ times (by linearity of
expectation). (Connectors are the endpoints of the short walks, i.e., the points where we stitch. Note that only
when a vertex is visited as a connector do we end up using a short walk initiated from that vertex.) Then for
K walks, each of length $\ell$, the expected number of times that v is visited as a connector vertex is
$K \cdot \frac{\ell\, d(v)}{2 m \lambda} = \alpha\, d(v)$. Let N denote the number of times the vertex v is visited as a connector in K walks.
By the above, $E[N] = \alpha\, d(v)$. By Markov's inequality, $\Pr(N > d(v)) \le \frac{E[N]}{d(v)} = \alpha$. Now consider the above
experiment (for a fixed vertex v) repeated $c \log n$ independent times for some suitably large constant c. (In
other words, assume that we do $c\, d(v) \log n$ short walks, in total over all experiments, from each node v.) We
say that an experiment is a "success" if $N \le d(v)$. If we have success, then that means that we have done K
walks of length $\ell$ (and hence utilized a constant fraction of the d(v) short walks) for that experiment before a
request fails. By the above, the probability of success is at least some constant $\alpha' = 1 - \alpha$. Let
$X^v_1, X^v_2, \ldots, X^v_{c \log n}$ be the 0-1 indicator random variables such that $X^v_i = 1$ if success occurs in the i-th
experiment and zero otherwise. Let $X^v = \sum_{i=1}^{c \log n} X^v_i$. Then $E[X^v] = \alpha' c \log n$. Since the variables are
independent, by Chernoff's bound
$\Pr(|X^v - E[X^v]| > c' \log n) < e^{-\frac{2 c'^2 \log^2 n}{c \log n}} < \frac{1}{n^2}$, for a suitable constant $c < (c')^2$. Therefore,
$\Pr(|X^v - E[X^v]| > c' \log n) < \frac{1}{n^2}$. Thus, at least a constant fraction of the experiments succeed with
probability $1 - 1/n^2$. By the union bound [28], the total number of visits to every vertex v as connector in all
($c \log n$ times K) walks is at most $O(\alpha\, d(v)\, c \log n)$ with probability at least $1 - 1/n$. This implies that the
total number of short walks utilized is a constant fraction of the best possible, before a request fails. □

We now present the main theorem of this paper, which follows from the above bound on $\kappa$ stated in
Theorem 3.1 and the message and round complexity bounds in terms of $\kappa$ stated in Corollary 2.8. Notice that
this presents optimal round and message complexities simultaneously for every walk (since, independently,
$\Omega(\ell + D)$ is a clear lower bound on the number of messages for a single $\ell$-length random walk, and
$\tilde{\Omega}(\sqrt{\ell D} + D)$ is a nontrivial lower bound on the number of rounds for a single $\ell$-length random walk, as
shown in [32]).

Theorem 3.2. Algorithm CONTINUOUS-RANDOM-WALK satisfies walk requests continuously and
indefinitely such that the amortized message complexity per walk is $O(\ell + D)$, and, with high probability, every
single walk request completes in $\tilde{O}(\sqrt{\ell D} + D)$ rounds.

3.1 Extensions to different walk lengths 

While our main algorithm of CONTINUOUS-RANDOM-WALK and the associated theorems are stated for 
a fixed £, they can call be generalized to handle different walk lengths. We omit the rigorous details for 
brevity and present a brief explanation of the generalization here. The theorems and experiments go through 
verbatim for this case as well. 

Suppose that Continuous-Random-Walk is designed to not only support new source node requests 
each time but also new length requests for the random walks. One can of course store multiple PRE- 
PROCESSING tables, one for each associated ti, in the entire allowed range for £. This way, when a request 
is presented, the appropriate PRE-PROCESSING table is accessed and the corresponding short walks queried. 
Then, whenever Continuous-Random-Walk fails on a specific single random walk request, only this 
PRE-PROCESSING is rerun, and answering the random walk requests resume. 

While this does solve the problem and guarantees the identical throughput and efficiency, a practical 
concern is that performing and storing so many short walks, corresponding to multiple different lengths, 
can be expensive. There is a simple way to counter this: store short walks in a doubling fashion. 
In particular, if each $\ell_i$ lies in the range $[1, n]$, instead of storing short walks corresponding to each of 
$\ell_i = 1, 2, 3, \ldots, n-1, n$, we perform short walks only for $\ell_i = 1, 2, 4, \ldots, n/2, n$. This 
exponentially reduces the number of short walks at each node, or the number of pre-processing tables, from 
$n$ to $\log n$. Now, whenever a walk request for $\ell_i$ is received, it can be answered by just performing a longer 
walk, of length $\ell'_i$ such that $\ell_i \le \ell'_i \le 2\ell_i$. 
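The doubling scheme can be illustrated with a minimal sketch. The function names `stored_lengths` and `table_length_for` are hypothetical and not part of the paper's algorithms; the sketch only shows which table lengths would be kept and which one would serve a given request, assuming lengths are powers of two.

```python
import bisect


def stored_lengths(max_len):
    """Return the doubling sequence 1, 2, 4, ... up to the first value >= max_len."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    lengths = [1]
    while lengths[-1] < max_len:
        lengths.append(lengths[-1] * 2)
    return lengths


def table_length_for(request_len, lengths):
    """Return the smallest stored length that is >= request_len.

    Because stored lengths double, the result is always < 2 * request_len.
    """
    if request_len < 1:
        raise ValueError("request_len must be at least 1")
    idx = bisect.bisect_left(lengths, request_len)
    if idx == len(lengths):
        raise ValueError("request exceeds the largest stored length")
    return lengths[idx]


if __name__ == "__main__":
    lengths = stored_lengths(1024)
    print(len(lengths), "tables instead of 1024")
    print("request 300 is served by table", table_length_for(300, lengths))
```

When the maximum length is not a power of two, this sketch rounds the largest table up to the next power of two, which is an assumption made here for simplicity.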

4 Experiments 

We use the following five graph generative models for our experiments. Several of these have 
been used in other papers for random walk experiments as well; see, e.g., [18]. Together, these graphs cover 
a broad spectrum, from fast mixing to slow mixing, uniform degrees to very skewed degrees, and small diameter to 
large diameter, thereby testing the algorithm in the extreme cases as well as the well-behaved ones. 

• Regular Expander: We use the most commonly studied random graph model, $G(n, p)$. Here, 
each of the $n(n-1)/2$ possible edges occurs independently with probability $p$. We choose 
$p = \log n / n$ so that the expected number of edges is roughly $(n \log n)/2$. Further, the expected degree 
of every vertex is roughly $\log n$. With high probability, this results in a graph with good expansion that is 
regular in expectation. 

• Two-tier topologies with clustering: First we construct four isolated, roughly regular expanders of the 
same size using $G(n, p)$ as described above - think of these as independent clusters. Then from each 
cluster we pick a small number of nodes (roughly one-fourth the size of the cluster) and connect them 
using another $G(n, p)$ - think of this as a tier-two cluster. Again we use the same value of $p$ as above. 

• Power-law graphs: In distributed settings, many important networks are known to have power-law degree 
distributions. We use the well-known preferential attachment growth model to construct random power-law graphs. 
The process starts with a small clique (of, say, 5 nodes) and then adds 
vertices sequentially. Each newly added vertex connects by an edge to each of the previous 
vertices independently, with probability depending on their degrees. Specifically, the new vertex 
connects to a previous vertex $v$ with probability proportional to $\deg(v)^{\alpha}$, where the exponent $\alpha$ is a 
parameter. 

• Random Geometric Graph: A random geometric graph is a random undirected graph drawn on a 
bounded region $[0, 1) \times [0, 1)$. It is generated by placing $n$ vertices uniformly at random and independently 
on the region (i.e., both the $x$ and $y$ coordinates are picked uniformly and independently). Then 
edges are constructed deterministically - two vertices $u$ and $v$ are connected by an edge if and only if 
the distance between them is at most a parameter threshold $r$. We choose $r$ on the order of $\sqrt{\log n / n}$ 
so that the degree of each vertex is $O(\log n)$ w.h.p. 

• Grid Graph: Consider a square grid graph ($\sqrt{n} \times \sqrt{n}$), which is the Cartesian product of two path graphs 
with $\sqrt{n}$ vertices each. Since a path graph is a median graph, the square grid graph is also a median 
graph. All grid graphs are bipartite (since they have no odd-length cycles). 
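As a rough illustration of three of the models above, the following sketch builds $G(n, p)$, a random geometric graph, and a square grid as adjacency sets. The function names and the adjacency-set representation are assumptions made for this example, not the generators used in the experiments; the threshold $r$ is left as a caller-supplied parameter.

```python
import math
import random


def gnp_graph(n, p, rng):
    """G(n, p): each of the n(n-1)/2 possible edges appears independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def random_geometric_graph(n, r, rng):
    """Place n points uniformly in [0,1) x [0,1); join two points iff their distance is at most r."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if math.dist(points[u], points[v]) <= r:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def grid_graph(side):
    """Square grid with side * side vertices, labelled by (row, column)."""
    adj = {(i, j): set() for i in range(side) for j in range(side)}
    for i in range(side):
        for j in range(side):
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if ni < side and nj < side:
                    adj[(i, j)].add((ni, nj))
                    adj[(ni, nj)].add((i, j))
    return adj


if __name__ == "__main__":
    rng = random.Random(7)
    n = 200
    g = gnp_graph(n, math.log(n) / n, rng)
    avg_degree = sum(len(nbrs) for nbrs in g.values()) / n
    print("average degree:", avg_degree, "vs log n:", math.log(n))
```

The quadratic pair loops are adequate for small illustrative inputs; for graphs with $n = 10{,}000$ nodes a faster construction would likely be preferable.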

We compute and maintain a pre-processing table containing $\eta_v = \eta \deg(v) \log n$ short walks of length $\lambda$ 
from each vertex $v$. We then check how many walks of length $\ell$ can be done using this table before we hit a 
node all of whose short walks have been exhausted. The source nodes for the $\ell$-length random walk 
requests are sampled randomly according to the degree distribution. 
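The measurement just described can be sketched as a small centralized simulation. This is only an illustration under stated assumptions, not the distributed algorithm itself: the names `build_table` and `utilization` are hypothetical, the logarithm is taken as natural, each table row stores only the endpoint of a short walk, the graph is given as adjacency lists with minimum degree at least 1, and any leftover $\ell \bmod \lambda$ steps are ignored.

```python
import math
import random


def short_walk_endpoint(adj, start, length, rng):
    """Endpoint of a simple random walk of the given length starting at `start`."""
    node = start
    for _ in range(length):
        node = rng.choice(adj[node])
    return node


def build_table(adj, eta, lam, rng):
    """For each vertex v, store ceil(eta * deg(v) * log n) short-walk endpoints."""
    n = len(adj)
    table = {}
    for v, nbrs in adj.items():
        count = max(1, math.ceil(eta * len(nbrs) * math.log(n)))
        table[v] = [short_walk_endpoint(adj, v, lam, rng) for _ in range(count)]
    return table


def utilization(adj, eta, lam, ell, rng):
    """Fraction of stored short walks consumed before some request gets stuck."""
    if lam < 1 or lam > ell:
        raise ValueError("need 1 <= lam <= ell")
    table = build_table(adj, eta, lam, rng)
    total = sum(len(rows) for rows in table.values())
    nodes = list(adj)
    weights = [len(adj[v]) for v in nodes]  # sources drawn proportionally to degree
    used = 0
    while True:
        node = rng.choices(nodes, weights=weights)[0]
        for _ in range(ell // lam):  # stitch short walks into one long walk
            if not table[node]:
                return used / total  # this node's short walks are exhausted
            node = table[node].pop()
            used += 1


if __name__ == "__main__":
    rng = random.Random(0)
    cycle = {v: [(v - 1) % 64, (v + 1) % 64] for v in range(64)}
    print("utilization:", utilization(cycle, eta=1.0, lam=8, ell=64, rng=rng))
```

The returned fraction plays the role of the utilization plotted in the figures of this section, under the simplifications listed above.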

We perform experiments on each of the aforementioned synthetically generated graphs, varying 
different parameters. In particular, we conduct separate experiments for each of (a) varying the length 
of the walk ($\ell$) as a function of $n$, (b) varying the number of nodes ($n$), (c) varying the length of the short 
walks ($\lambda$) stored in the pre-processing table, and (d) varying the number of short walks stored from each node 
as a function of the parameter $\eta$. For each of these, when a specific parameter 
is being varied for a plot, the others are held constant at default values. The default values we use are $n = 10{,}000$, $\ell = n$, 
$\eta = 1$, and $\lambda = \log n$. 

Since we are interested in how many random walks of length $\ell$ can be done in a continuous manner 
with small round and message complexity, this translates to analyzing the utilization of one specific 
pre-processing table before CONTINUOUS-RANDOM-WALK gets stuck and needs to invoke another call to 
PRE-PROCESSING. In particular, to analyze the round complexity, we conduct a set of experiments to evaluate $\kappa$, 
the fraction of rows of the PRE-PROCESSING table used before the algorithm fails ($\kappa$ is plotted on the y-axis). 
As mentioned in the previous section, this gives a bound of $1/\kappa$ on the round complexity. In particular, 
if $\kappa$ is a constant, and large enough, this shows excellent utilization and an asymptotically optimal round 
complexity. Similarly, for message complexity, we present a second set of plots that show the 
message complexity on the y-axis, computed from $\kappa$ and $D$, for easier visualization. 

We plot graphs by varying each of the parameters $\ell$, $n$, $\lambda$, $\eta$. Each figure contains five lines, one for each 
of the above network models. For each plotted value, we perform ten different runs and present 
the average. 

4.1 Short walk utilization factor $\kappa$ 

Varying $\ell$ [Figure 1]: Here $n$ is fixed at 10,000 and $\ell$ varies as $n^{0.5}, n^{0.6}, \ldots, n^{1.2}$; $\lambda$ is $\sqrt{\ell}$ and $\eta$ is $\log n$. 
In this case we see that at least 50% of the pre-processed short walk rows are used up. This utilization is even 
better for some of the graph topologies, such as $G(n, p)$ and the two-tier clustering graph, where it reaches around 
80%. Therefore, over the entire range from small to very large walk lengths, our algorithm performs extremely well: 
$\kappa$ is a large constant, and therefore the round complexity and message complexity are close to 
optimal, i.e., within a constant factor of the best possible. 

Varying $n$ [Figure 2]: The number of nodes $n$ varies between 1000 and 10,000. We see that in all of the 
graphs, the utilization of pre-processed short walks before the algorithm terminates is at least 60%. We also 
see that in some of the graphs, the utilization is substantially higher. Therefore, even as the graph size scales, our 
performance remains equally good. 
Figure 3: Varying the number of short walks $\eta$. $n = 10$K, $\ell = n$, $\lambda = \sqrt{\ell}$ 

Varying $\eta$ [Figure 3]: We see that the fraction of rows used increases with the number of short 
walks $\eta$. Even for $\eta$ as small as 1, on all graphs, the utilization $\kappa$ on the y-axis is at least 
0.6; that is, 60% of all the short walks get used. This means that for each node $v$, $\deg(v) \log n$ short 
walks suffice, and therefore the round and message complexity remain near-optimal, as proved previously. 

Varying $\lambda$ [Figure 4]: The default value of $\lambda$ is $\sqrt{\ell}$. In this plot, we vary $\lambda$ from $0.25\sqrt{\ell}$ to $\sqrt{\ell}$ in doubling 
steps. We see that the utilization remains roughly the same throughout the plot. Even though the algorithm 
needs to choose $\lambda$ to optimize for rounds and messages, this plot shows that it performs well for any of 
these values. 

Summary of observed round complexity: To summarize the plots for varying different parameters on the 
x-axis, we see that in all the plots, the value of $\kappa$ on the y-axis is a constant and usually at least 0.5. Since $\kappa$ 
is 1 for optimal or perfect utilization of the table, we see that for all parameter values, the utilization is only 
a small constant factor (around 2) away from the optimal. Therefore, the round complexity, as proven in the 
previous section, increases only marginally. 

4.2 Message complexity plots 

Varying $\ell$ [Figure 5]: In this plot, we vary $\ell$ and note the message complexity of the algorithm 
CONTINUOUS-RANDOM-WALK per SINGLE-RANDOM-WALK request within it. For a walk of length $\ell$, the optimal 
number of messages would be $\ell$ itself. Notice that in our plot, all the lines (that is, for all the graphs) are 
very close to the $x = y$ line, which is the optimal line. Therefore, the amortized efficiency of 
CONTINUOUS-RANDOM-WALK is almost the best possible. 

Varying $n$ [Figure 6]: In this plot as well, since we use the default value of $\ell = n$, the best possibility is for 
the message complexity to be $n$, which corresponds to the $x = y$ line. Notice that again, for all the graphs, 
the message complexity is almost the best possible through the entire range: we get 
straight lines with slope very close to that of the $x = y$ line. 

Varying $\eta$ [Figure 7]: As $\eta$ is increased from 0.25 to 4, we see that the message complexity drops 
rapidly. It is expected that as the number of pre-processing rows is increased, the efficiency improves 
and therefore the message complexity improves as well. The sharp decline in this plot, however, also suggests that 
even a small $\eta$ is sufficient to bring the message complexity down close to optimal, regardless 
of the graph topology. 

Varying $\lambda$ [Figure 8]: This plot is very similar to that of varying $\eta$; here we see again that as $\lambda$ is increased, 
the message complexity goes down rapidly. Recall that here we are comparing different $\lambda$ values for a fixed 
$\ell$ value of $n$. Our algorithm CONTINUOUS-RANDOM-WALK uses $\lambda = \sqrt{\ell}$, but we also tried this plot with 
smaller values of $\lambda$. As expected, the message complexity is high initially; however, as $\lambda$ is increased close 
to $\sqrt{\ell}$, the message complexity rapidly reduces, improving the algorithm's performance substantially. 

Summary of observed message complexity: In the naive approach, while each random walk requires 
$O(\ell)$ messages, the round complexity is increased significantly. At the other extreme, each random walk 
in [14] was round-efficient but required $\Omega(m)$ messages. Our algorithm CONTINUOUS-RANDOM-WALK 
achieves the best of both worlds by guaranteeing best-possible message and round complexity across graph 
topologies. The experiments suggest that for a wide range of parameters, the algorithm is able to answer 
each SINGLE-RANDOM-WALK request in a continuous manner with very few messages or rounds. These 
results corroborate our theoretical guarantees and highlight the practicality of our technique. 

5 Conclusion 

We present near-optimal distributed algorithms for random walk sampling in networks. Since node sampling 
is useful in various networking applications, our algorithms can serve as building blocks in a variety of 
distributed networking applications. 